Toward island-of-reliability-driven very-large-vocabulary on-line handwriting recognition using character confidence scoring
Abstract
We explore a novel approach for handwriting recognition tasks whose intrinsic vocabularies are too large to be applied directly as constraints during recognition. Our approach makes use of vocabulary constraints, and addresses the issue that some parts of words may be written more recognizably than others. An initial pass is made with an HMM recognizer, without vocabulary constraints, generating a lattice of character-hypothesis arcs representing likely segmentations of the handwriting signal. Arc confidence scores are computed using a posteriori probabilities. The most-confidently-recognized characters are used to filter the overall vocabulary, generating a word subset manageable for constraining a second recognition pass. With a vocabulary of 273,000 words, we can limit the vocabulary to 50,000 words in the second pass and eliminate 39.3% of the word errors made by a one-pass recognizer without vocabulary constraints, and 18.3% of errors made using a fixed 30,000-word set.
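The pipeline in the abstract (lattice, arc posteriors, vocabulary filtering) can be illustrated with a small sketch. This is not the authors' implementation: the lattice encoding, the function names `arc_posteriors`, `reliable_characters` and `filter_vocabulary`, the 0.8 threshold, and the "characters must appear in order" filter are all assumptions chosen for illustration. Arc posteriors are computed with a standard forward-backward pass over an acyclic lattice whose nodes are numbered in topological order.

```python
import math

def logaddexp(a, b):
    """Numerically stable log(exp(a) + exp(b))."""
    if a == -math.inf:
        return b
    if b == -math.inf:
        return a
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))

def arc_posteriors(arcs, n_nodes):
    """Posterior probability of each arc in an acyclic lattice.

    arcs: list of (start, end, char, log_score) with start < end
    (assumed encoding); node 0 is the start, n_nodes - 1 the end.
    """
    alpha = [-math.inf] * n_nodes   # forward log scores
    beta = [-math.inf] * n_nodes    # backward log scores
    alpha[0] = 0.0
    beta[n_nodes - 1] = 0.0
    for s, e, _, w in sorted(arcs, key=lambda a: a[0]):
        alpha[e] = logaddexp(alpha[e], alpha[s] + w)
    for s, e, _, w in sorted(arcs, key=lambda a: a[1], reverse=True):
        beta[s] = logaddexp(beta[s], w + beta[e])
    total = alpha[n_nodes - 1]
    return [math.exp(alpha[s] + w + beta[e] - total) for s, e, _, w in arcs]

def reliable_characters(arcs, posteriors, threshold=0.8):
    """Characters on arcs whose posterior meets the threshold, left to right.

    A threshold above 0.5 guarantees the selected arcs lie on a common path.
    """
    picked = [(s, c) for (s, _, c, _), p in zip(arcs, posteriors) if p >= threshold]
    return [c for _, c in sorted(picked)]

def contains_in_order(word, chars):
    """True if chars occur in word as an ordered subsequence."""
    it = iter(word)
    return all(c in it for c in chars)

def filter_vocabulary(vocabulary, chars):
    """Keep only words consistent with the confidently recognized characters."""
    return [w for w in vocabulary if contains_in_order(w, chars)]

# Toy lattice for a three-character word; scores are made-up log probabilities.
arcs = [
    (0, 1, "c", math.log(0.9)), (0, 1, "e", math.log(0.1)),
    (1, 2, "a", math.log(0.5)), (1, 2, "o", math.log(0.5)),
    (2, 3, "t", math.log(0.95)), (2, 3, "l", math.log(0.05)),
]
posteriors = arc_posteriors(arcs, n_nodes=4)
islands = reliable_characters(arcs, posteriors)
subset = filter_vocabulary(["cat", "cot", "cut", "eel", "act", "coat"], islands)
```

In this toy case the first and last characters are recognized confidently while the middle one is ambiguous, so only the word that lacks both reliable characters is dropped. A second recognition pass would then be constrained to the surviving subset.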
Similar resources
Improved On-Line Handwriting Recognition Using Context Dependent Hidden Markov Models
This paper presents the introduction of context dependent Hidden Markov Models for cursive, unconstrained handwriting recognition with large vocabularies. Since context dependent models were successfully introduced to speech recognition [1, 2, 3], it seems obvious that the use of trigraphs could also lead to improved on-line handwriting recognition systems [4]. In analogy to triphones in s...
Confidence-Scoring Post-Processing for Off-Line Handwritten-Character Recognition Verification
We apply confidence-scoring techniques to verify the output of an off-line handwritten-character recognizer. We evaluate a variety of scoring functions, including likelihood ratios and estimated posterior probabilities of correctness, in a post-processing mode, to generate confidence scores. Using the post-processor in conjunction with a neural-net-based recognizer, on mixed-case letters, receiv...
An Investigation of Context-dependent and Hybrid Modeling Techniques for Very Large Vocabulary On-line Cursive Handwriting Recognition
This paper addresses a very challenging topic in on-line handwriting recognition. It deals with the problem of how to further improve a baseline very large vocabulary HMM-based handwriting recognition system with a vocabulary size of 200,000 German words. The use of sophisticated HMM technology allows the construction of such a baseline system. It is however an extremely difficult task to further ...
On-Line Handwriting Recognition Using Hidden Markov Models
New global information-bearing features improved the modeling of individual letters, thus diminishing the error rate of an HMM-based on-line cursive handwriting recognition system. This system also demonstrated the ability to recognize on-line cursive handwriting in real time. The BYBLOS continuous speech recognition system, a hidden Markov model (HMM) based recognition system, is applied to on...
An Effective Character Separation Method for Online Cursive Uyghur Handwriting
There are many connected characters in cursive Uyghur handwriting, which makes the segmentation and recognition of Uyghur words very difficult. To enable large vocabulary Uyghur word recognition using character models, we propose a character separation method for over-segmentation in online cursive Uyghur handwriting. After removing delayed strokes from the handwritten words, potential breakpoi...
Publication date: 2001